Determination of Syntactic Functions in Estonian Constraint Grammar

Author

  • Kaili Müürisep
Abstract

This article describes the current state of syntactic analysis of Estonian using Constraint Grammar. The Constraint Grammar framework divides parsing into two modules: morphological disambiguation and determination of syntactic functions. This article focuses on the latter module in detail. If the morphological disambiguator achieves a precision of more than 85% and an error rate below 2%, then 80-88% of words become syntactically unambiguous. The error rate of the parser is 1-4%, depending on the ambiguity rate of the input. The main goal of this work is to develop an efficient parser for Estonian and to annotate the Corpus of Estonian Written Texts syntactically. It is the first attempt to write a parser for Estonian.

Similar resources

Parsing Estonian with Constraint Grammar

This paper describes the current state of syntactic analysis of Estonian using Constraint Grammar, focusing mainly on the determination of syntactic functions. Constraint Grammar of Estonian was written in 1996-2000 at the University of Tartu. The author has developed its syntactic part.

Shallow Parsing of Spoken Estonian Using Constraint Grammar

In this paper we describe how we have adapted the syntactic analyzer of written Estonian to the spoken language. The Constraint Grammar shallow syntactic parser (Müürisep et al. 2003) was used for the automatic syntactic analysis of the corpus of Estonian spoken language (Hennoste et al. 2000). To adapt the parser, the clause boundary detection rules as well as some syntactic constraints had to...

A New Language for Constraint Grammar: Estonian

The Constraint Grammar of Estonian presented in the paper is the first attempt in automatic syntactic analysis of Estonian. The grammar consists of 1,240 morphological disambiguation rules, 47 clause boundary detection rules, 180 morphosyntactic mapping rules and 1,118 syntactic constraints. The rules have been devised using a training corpus of 20,300 words and have been tested on a benchmark ...

Parsing Manually Detected and Normalized Disfluencies in Spoken Estonian

An experiment with an Estonian Constraint Grammar based syntactic analyzer is conducted, analyzing transcribed speech. In this paper the problems encountered during parsing disfluencies are analyzed. In addition, the amount by which the manual normalization of disfluencies improved the results of recall and precision was compared to non-normalized utterances.

Syntactically annotated corpora of Estonian

Syntactically annotated corpora are needed 1) to train and test parsers and various language technology products (grammar checkers, information retrievers and extractors, machine translators, etc.); 2) to check the agreement of existing linguistic theories with real language usage. The corpora can be annotated at different levels of depth. In shallow syntactically annotated corpora a syntact...

Publication date: 1999